56 research outputs found

    Data Cube Approximation and Mining using Probabilistic Modeling

    On-line Analytical Processing (OLAP) techniques commonly used in data warehouses allow the exploration of data cubes along different analysis axes (dimensions) and at different abstraction levels in a dimension hierarchy. However, such techniques are not aimed at mining multidimensional data. Since data cubes are nothing but multi-way tables, we propose to analyze the potential of two probabilistic modeling techniques, namely non-negative multi-way array factorization and log-linear modeling, with the ultimate objective of compressing and mining aggregate and multidimensional values. With the first technique, we compute the set of components that best fit the initial data set and whose superposition coincides with the original data; with the second technique, we identify a parsimonious model (i.e., one with a reduced set of parameters), highlight strong associations among dimensions, and discover possible outliers in data cells. A real-life example will be used to (i) discuss the potential benefits of the modeling output for cube exploration and mining, (ii) show how OLAP queries can be answered in an approximate way, and (iii) illustrate the strengths and limitations of these modeling approaches.
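    To make the log-linear idea concrete, here is a minimal Python sketch; it is not the authors' method. It fits the simplest log-linear model (independence of two dimensions) to a two-way slice of a cube, so that every cell is approximated from the marginal totals alone, and it uses standardized (Pearson) residuals to point at candidate outlier cells. The function name `independence_fit` and the toy counts are illustrative assumptions.

```python
def independence_fit(table):
    """Fit the independence model to a 2-way table of non-negative counts.

    Returns (expected, residuals), where expected[i][j] is the cell value
    reconstructed from the margins and residuals[i][j] is the Pearson
    residual (observed - expected) / sqrt(expected).
    Assumes every row and column total is strictly positive.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Under independence: expected = row_total * col_total / grand_total.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    residuals = [
        [(obs - exp) / exp ** 0.5 for obs, exp in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(table, expected)
    ]
    return expected, residuals


# Toy 2 x 2 slice (made-up counts, for illustration only).
toy_slice = [[10, 20],
             [30, 40]]
expected, residuals = independence_fit(toy_slice)
```

    Only the margins need to be stored to answer a cell query approximately, which is the compression aspect; cells with large absolute residuals are the ones the model explains poorly.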

    Breast Tumor Shape Representation Using Chain Codes (REPRESENTASI BENTUK TUMOR PAYUDARA DENGAN KODE RANTAI)

    Breast cancer is the second leading cause of death among women worldwide. Mammographic images can be used as an aid in detecting the disease. The presence of the disease is indicated by the characteristics of the breast tumor objects visible in the mammogram. This paper presents an algorithm for representing the shapes visible in mammographic images so that they can be used for breast tumor analysis. The algorithm is built step by step: it begins by separating or localizing the area suspected of containing a breast tumor to obtain a Region of Interest (ROI), then detects the edges of the tumor object (edge detection), thins those edges (contour delimitation), and finally represents the shape of the breast tumor.
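    The last step, shape representation by chain code, can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the paper's algorithm: it assumes the contour has already been extracted as an ordered, closed list of 8-adjacent pixels, and it uses one common Freeman numbering (0 = east, increasing counter-clockwise). The name `freeman_chain_code` is hypothetical.

```python
# Map a pixel step (d_row, d_col) to an 8-direction Freeman code.
# Rows grow downward, so "north" is d_row = -1. The numbering below
# (0 = east, counter-clockwise in 45-degree steps) is an assumed convention.
DIRECTIONS = {
    (0, 1): 0, (-1, 1): 1, (-1, 0): 2, (-1, -1): 3,
    (0, -1): 4, (1, -1): 5, (1, 0): 6, (1, 1): 7,
}


def freeman_chain_code(contour):
    """Return the chain code of an ordered, closed contour.

    contour: list of (row, col) pixels; each pixel must be 8-adjacent to the
    next one, and the last pixel must be 8-adjacent to the first.
    """
    codes = []
    n = len(contour)
    for i in range(n):
        r0, c0 = contour[i]
        r1, c1 = contour[(i + 1) % n]  # wrap around to close the contour
        step = (r1 - r0, c1 - c0)
        if step not in DIRECTIONS:
            raise ValueError(f"pixels {contour[i]} and {contour[(i + 1) % n]} are not 8-adjacent")
        codes.append(DIRECTIONS[step])
    return codes


# A 2 x 2 square traversed clockwise on screen: east, south, west, north.
square = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(freeman_chain_code(square))  # [0, 6, 4, 2]
```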

    OLEMAR: An Online Environment for Mining Association Rules in Multidimensional Data

    Data warehouses and OLAP (online analytical processing) provide tools to explore and navigate through data cubes in order to extract interesting information under different perspectives and levels of granularity. Nevertheless, OLAP techniques do not allow the identification of relationships, groupings, or exceptions that could hold in a data cube. To that end, we propose to enrich OLAP techniques with data mining facilities to benefit from the capabilities they offer. In this chapter, we propose an online environment for mining association rules in data cubes. Our environment, called OLEMAR (online environment for mining association rules), is designed to extract associations from multidimensional data. It allows the extraction of inter-dimensional association rules from data cubes according to a sum-based aggregate measure, a more general indicator than the aggregate values provided by the traditional COUNT measure. In our approach, OLAP users are able to drive a mining process guided by a meta-rule that meets their analysis objectives. In addition, the environment is based on a formalization that exploits aggregate measures to revisit the definition of the support and the confidence of discovered rules. This formalization also helps evaluate the interestingness of association rules according to two additional quality measures: lift and Loevinger. Furthermore, in order to focus on the discovered associations and validate them, we provide a visual representation based on the principles of graphic semiology. Such a representation consists of a graphic encoding of frequent patterns and association rules in the same multidimensional space as the one associated with the mined data cube. We have developed our approach as a component of a general online analysis platform called Miningcubes, using an Apriori-like algorithm that extracts inter-dimensional association rules directly from materialized multidimensional structures of data. To illustrate the effectiveness and efficiency of our proposal, we analyze a real-life case study on breast cancer data and conduct performance experiments on the mining process.
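    The sum-based view of support and confidence can be illustrated with a small Python sketch. The definitions below are an assumption about how such measures are typically built (the ratio of summed measure values instead of the ratio of row counts); the abstract does not spell out OLEMAR's exact formulas. The names `measure_sum` and `rule_quality`, the dimension names, and the toy facts are all hypothetical.

```python
def measure_sum(facts, condition):
    """Sum the measure over all facts whose dimensions match `condition`."""
    return sum(
        fact["measure"]
        for fact in facts
        if all(fact[dim] == value for dim, value in condition.items())
    )


def rule_quality(facts, body, head):
    """Quality of the inter-dimensional rule body -> head, sum-based.

    Assumes the body has non-zero support and the head does not cover
    the whole cube (otherwise the ratios below divide by zero).
    """
    total = measure_sum(facts, {})
    supp_body = measure_sum(facts, body) / total
    supp_head = measure_sum(facts, head) / total
    supp_rule = measure_sum(facts, {**body, **head}) / total
    confidence = supp_rule / supp_body
    return {
        "support": supp_rule,
        "confidence": confidence,
        "lift": confidence / supp_head,
        "loevinger": (confidence - supp_head) / (1 - supp_head),
    }


# Made-up cube cells: two dimensions and one additive measure.
facts = [
    {"dim_a": "a1", "dim_b": "b1", "measure": 30},
    {"dim_a": "a1", "dim_b": "b2", "measure": 10},
    {"dim_a": "a2", "dim_b": "b1", "measure": 20},
    {"dim_a": "a2", "dim_b": "b2", "measure": 40},
]
quality = rule_quality(facts, body={"dim_a": "a1"}, head={"dim_b": "b1"})
```

    With every measure value set to 1, these ratios reduce to the usual COUNT-based support and confidence, which is one way to read the abstract's claim that the sum-based measure is more general.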

    Subdirect Decomposition of a Lattice into Irreducible Factors (Décomposition sous-directe d'un treillis en facteurs irréductibles)

    The size of a concept lattice can grow exponentially with the size of the context. When the number of nodes becomes large, studying and generating such a lattice becomes impossible. Decomposing the lattice into small sublattices is one way to work around this problem. In a subdirect decomposition, the small sublattices generated are quotients that have an interesting interpretation within Formal Concept Analysis. In this article, we present the steps for obtaining a subdirect decomposition into irreducible lattices, starting from a finite, reduced context. This decomposition is obtained using three points of view: quotient lattices, arrow relations, and compatible subcontexts. The approach is essentially algebraic, since it relies on lattice theory, except for the last point. We give a polynomial algorithm for generating this decomposition from an initial context. The method can be extended to support interactive exploration or mining of large contexts.

    Formal Concept Analysis and Extensions for Complex Data Analytics

    No full text

    Using Taxonomies on Objects and Attributes to Discover Generalized Patterns

    No full text
    In this paper, we show how the existence of taxonomies on objects and/or attributes can be used in formal concept analysis to help discover generalized patterns in the form of concepts. To that end, we analyze three generalization cases and different scenarios of a simultaneous generalization on both objects and attributes. We also contrast the number of generalized patterns against the number of simple patterns

    Similarity-based Clustering versus Galois lattice building: Strengths and Weaknesses

    No full text
    In many real-world applications, designers tend towards building classes of objects such as concepts, chunks and clusters according to some similarity criteria. In this paper, we first compare two approaches to clustering: the Galois lattice approach [14] and a similarity-based clustering approach [27]. Then, we sketch the possible ways each approach can benefit from the other in refining the process of building a hierarchy of classes out of a set of instances.
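    For readers unfamiliar with Galois (concept) lattices, the following Python sketch enumerates the formal concepts of a tiny context, i.e., the nodes of its lattice. It is a deliberately naive illustration (closing every subset of objects, hence exponential in the number of objects) and not the algorithm of either compared approach; the names `formal_concepts`, `intent`, `extent`, and the toy context are assumptions.

```python
from itertools import combinations


def intent(objects, context, attributes):
    """Attributes shared by every object in `objects`."""
    return frozenset(a for a in attributes if all(a in context[o] for o in objects))


def extent(attrs, context):
    """Objects that possess every attribute in `attrs`."""
    return frozenset(o for o, owned in context.items() if attrs <= owned)


def formal_concepts(context):
    """Return all (extent, intent) pairs of a context {object: set_of_attributes}."""
    attributes = frozenset().union(*context.values())
    objects = list(context)
    concepts = set()
    # Close every subset of objects; duplicates collapse in the set.
    for size in range(len(objects) + 1):
        for subset in combinations(objects, size):
            shared = intent(subset, context, attributes)
            concepts.add((extent(shared, context), shared))
    return concepts


# Toy context: three objects described by attributes a, b, c.
toy_context = {"o1": {"a", "b"}, "o2": {"b", "c"}, "o3": {"b"}}
concepts = formal_concepts(toy_context)
print(len(concepts))  # 4
```

    Each concept is a class defined by exact attribute sharing and not by a similarity threshold, which is the main contrast with similarity-based clustering.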